Learning from Human-Generated Lists

نویسندگان

  • Kwang-Sung Jun
  • Xiaojin Zhu
  • Burr Settles
  • Timothy T. Rogers
چکیده

Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model — sampling with reduced replacement (SWIRL) — for such lists. We discuss SWIRL’s relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a “feature volunteering” task where non-experts spontaneously generate feature⇒label pairs for text classification, SWIRL improves the accuracy of state-ofthe-art feature-learning frameworks. (ii) In a “verbal fluency” task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cycle Time Optimization of Processes Using an Entropy-Based Learning for Task Allocation

Cycle time optimization could be one of the great challenges in business process management. Although there is much research on this subject, task similarities have been paid little attention. In this paper, a new approach is proposed to optimize cycle time by minimizing entropy of work lists in resource allocation while keeping workloads balanced. The idea of the entropy of work lists comes fr...

متن کامل

Automatic term list generation for entity tagging

MOTIVATION Many entity taggers and information extraction systems make use of lists of terms of entities such as people, places, genes or chemicals. These lists have traditionally been constructed manually. We show that distributional clustering methods which group words based on the contexts that they appear in, including neighboring words and syntactic relations extracted using a shallow pars...

متن کامل

Material Development and English for Academic Purposes Word Lists; a Reductionist Approach

Nagy (1988) states that vocabulary is a prerequisite factor in comprehension. Drawing upon a reductionist approach and having in mind the prospects for material development, this study aimed at creating an English for Academic Purposes Word List (EAPWL). The corpus of this study was compiled from a corpus containing 6479 pages of texts, 2,081,678 million tokens (running words) and 63825 types (...

متن کامل

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: In processing large data, scientists have to perform the tedious task of analyzing hefty bulk of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be unified to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...

متن کامل

Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction

The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in the short term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013